Medical Image Analysis — Latest Matching Preprints

1

Fully Homomorphic Collaborative Learning for Safe Cross-Healthcare Institution Development and Implementation of Foundation Models

Bian, S.; Qiao, H.; Yan, T.; Xia, Z.; Gao, X.; Xu, Y.; Shen, R.; Ma, T.; Guan, Z.; Wang, Y. X.; Wong, T. Y.; Dai, Q.

2026-05-20 ophthalmology 10.64898/2026.05.15.26353345 medRxiv

Top 0.1%

22.6%

Foundation models (FMs) are powerful tools for enabling the broad clinical application of artificial intelligence (AI) in healthcare systems, offering adaptability to different diseases, modalities, and clinical settings. However, FMs require large-scale datasets to train and fine-tune, while most real-world data are siloed within individual healthcare institutions under strict data privacy protection, a restriction that poses a fundamental challenge to the cross-institution development of FMs. Here, we develop a fully homomorphic collaborative learning framework, named FOCAL, that enables secure FM-driven diagnosis without exposing raw patient information. Unlike traditional federated learning (FL) frameworks that aggregate locally trained models, FOCAL integrates fully homomorphic encryption (FHE) with split training to execute collaborative learning entirely over encrypted data. Specifically, we apply FOCAL to different types of retinal and pathology FMs to demonstrate its clinical performance. Under gradient inversion attacks, FOCAL reduced the data leakage rate from 90.6% to 0% while achieving accuracy comparable to state-of-the-art FL paradigms, owing to the provable security provided by FHE. Moreover, under the same level of security, FOCAL can boost the macro-average AUROC by nearly 50% (from 0.5202 to 0.9831) when evaluated against fully encrypted FL models. In the multi-institution comparative experiments, FOCAL consistently outperforms all single-institution FMs, improving AUROCs by 9.62% and 14.46% on ocular disease diagnosis and severity classification, respectively. Lastly, external validations on both retinal and pathology FMs further verified the accuracy and security advantages of FOCAL and highlighted its reliable interpretability and generalizability for cross-institution clinical development and implementation of FMs. FOCAL is a novel method for building a secure data-sharing AI community, enabling healthcare institutions to benefit from and contribute to next-generation FM development without compromising patient privacy and data security.

2

FiberLM: A Transformer-Based Model for Mouse Brain Diffusion MRI Tractography Guided by Viral Tracer Data

Wen, R.; Zhang, J.; Liang, Z.

2026-05-11 neuroscience 10.64898/2026.05.06.723316 medRxiv

Top 0.1%

18.9%

Diffusion MRI (dMRI) tractography provides a non-invasive method for mapping whole-brain structural connectivity. However, its application is limited by substantial false-positive and false-negative connections. While deep learning based methods have shown promise in improving tractography, most rely on training data derived from conventional dMRI tractography, therefore inheriting the same limitations. Here, we introduce FiberLM, an attention-based Transformer model for mouse brain tractography. The model was trained using a whole-brain streamline dataset based on viral tracer data from the Allen Mouse Brain Connectivity Atlas (AMBCA), allowing the model to learn the properties of both local and long-range axonal trajectories through self-attention. FiberLM was applied to predict anatomically plausible axonal trajectories from ex vivo high-resolution mouse brain dMRI data. Quantitative evaluations demonstrated that FiberLM significantly reduced false-positive and false-negative connections, improved spatial agreement with tracer-defined pathways, and generated whole-brain connectomes that more closely approximated AMBCA results compared to conventional tractography. These findings suggest FiberLM as a potential tool for accurate reconstruction of mouse brain structural connectomics.

3

Toward CT-based Tractography: Presurgical White Matter Tract Mapping in Intracerebral Hemorrhage

Huang, G.; Xie, G.; Li, Y.; Wang, Q.; Yao, S.; Tan, Y.; Kikinis, R.; Golby, A. J.; O'Donnell, L. J.; Zhang, F.

2026-05-21 neuroscience 10.64898/2026.05.18.724202 medRxiv

Top 0.1%

15.2%

Presurgical mapping of key white matter (WM) fiber tracts is crucial for intracerebral hemorrhage (ICH) surgery, but it currently relies on tractography from diffusion MRI (dMRI), which has limited applicability in urgent or resource-constrained settings due to long scan times and limited MRI availability. To bridge this gap, we developed a deep learning approach designed to reconstruct critical fiber tracts directly from routinely acquired CT scans, focusing on the corticospinal tract (CST) due to its high clinical relevance. By training a novel network on a curated dataset of 150 paired CT-dMRI scans (101 ICH patients and 49 healthy subjects), we enabled the direct mapping of the CST from CT images alone, bypassing the traditional requirement for dMRI. Our results demonstrate that the CT-derived tracts achieve high anatomical plausibility, with neurosurgeon expert assessments yielding a Likert score of 3.64. Furthermore, the clinical relevance of these reconstructions was validated by a significant correlation between CT-derived tract integrity and patient motor scores ($r = 0.726$, $p = 3.731 \times 10^{-7}$). These findings suggest that in complex clinical scenarios, particularly where dMRI signal quality is compromised by lesion-induced distortion, this CT-based mapping may serve as a useful anatomical reference. Overall, by enabling CT-based WM mapping, our method potentially offers a practical route to gain the key advantages of tractography in resource-limited or time-critical settings.

4

Replicability of unsupervised deep learning derived image phenotypes

Xia, T.; Islam, S. M. S.; Xie, Z.; Zhao, X.; Zhi, D.

2026-05-19 bioinformatics 10.64898/2026.05.19.726257 medRxiv

Top 0.1%

14.5%

Unsupervised deep-learning image phenotypes (UDIPs) derived from brain MRI are propelling imaging genetics to link brain structure to genetic variation. However, their replicability across datasets has not been sufficiently evaluated, raising questions about whether they capture robust biological structure or reflect training-specific artifacts. Here, we assess the replicability of UDIPs under variation in model initialization, data partitioning, and cohort, directly evaluating their stability across experimental conditions. We trained multiple models under (i) different training batch random seeds, (ii) cross-validation splits, and (iii) independent datasets (UKB and ADNI), across CNN and ViT architectures. We then derived representations from a separate UKB discovery cohort (N = 22,985) for both trained models and randomly initialized models without training. Representation stability was assessed using centered kernel alignment (CKA; mean ViT 0.74 vs random 0.27) and kernel canonical correlation analysis (KCCA; mean ViT 0.84 vs random 0.60), and genetic discovery stability was assessed using the loci overlap ratio (mean ViT 0.45 vs random 0.08). We further applied weighted MAXVAR generalized CCA to 12 embeddings to extract a shared 30-dimensional subspace. Our results showed that UDIPs exhibit statistically significant stability (CKA, KCCA t test p < 0.001) across training perturbations and preserve biologically meaningful structure (loci overlap ratio t test p < 0.001) across cohorts, supporting their use in imaging genetics.
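To make the CKA stability measure concrete, the following is a minimal sketch of linear CKA between two embedding matrices computed on the same subjects. The abstract does not say which kernel the authors used, so the linear kernel is an assumption here, and the function names are hypothetical.

```python
from math import sqrt


def center_columns(matrix):
    """Subtract each column's mean (rows are subjects, columns are features)."""
    n_rows = len(matrix)
    means = [sum(col) / n_rows for col in zip(*matrix)]
    return [[value - mean for value, mean in zip(row, means)] for row in matrix]


def cross_gram_sq_norm(a, b):
    """Squared Frobenius norm of a^T b for row-major matrices with equal row counts."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(u * v for u, v in zip(col_a, col_b))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered data.

    Assumes neither embedding is constant across subjects (non-zero denominator).
    """
    xc, yc = center_columns(x), center_columns(y)
    numerator = cross_gram_sq_norm(xc, yc)
    denominator = sqrt(cross_gram_sq_norm(xc, xc) * cross_gram_sq_norm(yc, yc))
    return numerator / denominator
```

Linear CKA is unchanged by rotating or uniformly rescaling either embedding, which is why it suits comparing models trained from different seeds whose latent axes need not line up.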

5

High Resolution Multi-depth Quantification of the Retinal Nerve Fiber Layer

Callet, C.; Bertrand, M.; Guzman, K.; Mece, P.; Rossi, E. A.; Grieve, K.

2026-06-01 ophthalmology 10.64898/2026.05.22.26353127 medRxiv

Top 0.1%

12.6%

The retinal nerve fiber layer, composed of axon bundles converging toward the optic nerve, is a key biomarker for diagnosing and monitoring glaucoma and other neurodegenerative diseases. High-resolution en face imaging of individual nerve fiber bundles offers morphological information beyond what conventional optical coherence tomography provides, yet clinical integration remains limited by the lack of automated analysis tools and normative data. Here, we imaged 14 healthy volunteers using time-domain full-field optical coherence tomography and adaptive optics scanning laser ophthalmoscopy, and developed automated pipelines to quantify bundle width, trajectory, tortuosity, and orientation. Bundles were on average 25% wider at shallower retinal depths, width measurements were consistent across imaging modalities, and estimated axon count per bundle decreased significantly with age. Global trajectory analysis revealed systematic deviations of high resolution data from existing mathematical models, particularly in the temporal sector, leading us to propose two refined trajectory models. These normative results provide a foundation for high resolution biomarkers for use in investigations of retinal neurodegeneration.

6

Anatomy-Guided 3D Graph Networks for Couinaud Segmentation in Tumor Affected Livers

You, L.; Dang, H.; Wang, H.; Matta, E.; Zhou, X.

2026-05-14 bioinformatics 10.64898/2026.05.11.724316 medRxiv

Top 0.1%

10.1%

Image-based liver Couinaud segmentation is designed to automatically provide the locations of suspicious objects in liver CT/MR images. Once achieved, the physicians will be guided to the target slice and area where the suspicious node is located. However, conventional algorithms trained primarily on healthy liver images often fail to generalize to hepatocellular carcinoma (HCC) cases due to pathological structural distortions. In this work, we propose a robust two-stage framework that integrates a 3D UNet with a 3D Anatomical Structure-Guided Graph Convolutional Network (3D GCN). This two-stage strategy effectively isolates the liver volume to eliminate structural noise from neighboring organs, such as the spleen, allowing the framework to focus exclusively on the complex 3D anatomical relationships among the eight segments. To ensure the topological consistency required for global spatial reasoning, we implement a standardized preprocessing pipeline that normalizes liver-only volumes to exactly 50 frames along the z-axis. By combining a lightweight 3D UNet backbone with the 3D GCN for refined boundary reasoning, our model demonstrates superior generalization performance on unseen clinical datasets, achieving a mean Dice score of 0.828 in blind testing. By releasing our code and pretrained weights, we aim to provide the first publicly available deep learning resource for robust Couinaud segmentation.
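The z-axis normalization step could look roughly like the sketch below, which picks 50 evenly spaced slices by nearest-index resampling. The abstract does not describe the interpolation scheme, so nearest-index selection and the function names are assumptions made for illustration only.

```python
def resample_slice_indices(n_slices, target=50):
    """Return `target` evenly spaced source-slice indices covering 0..n_slices-1."""
    if n_slices < 1 or target < 1:
        raise ValueError("n_slices and target must be positive")
    if target == 1:
        return [(n_slices - 1) // 2]
    step = (n_slices - 1) / (target - 1)
    return [round(i * step) for i in range(target)]


def resample_volume(volume, target=50):
    """Resample a list of 2-D slices (ordered along z) to `target` slices.

    Slices are repeated when the input has fewer than `target` slices and
    skipped when it has more.
    """
    return [volume[i] for i in resample_slice_indices(len(volume), target)]
```

A fixed slice count gives every liver the same grid along z, which is one plausible way to obtain the consistent graph layout the abstract says the 3D GCN relies on.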

7

Anatomically constrained and curated cerebellar tractography (ACCURAT): an open framework and a pathway-specific neuroanatomical reference

Legarreta, J. H.; Rushmore, R. J.; Yeterian, E. H.; Makris, N.; Rathi, Y.; O'Donnell, L. J.

2026-05-06 neuroscience 10.64898/2026.05.02.722413 medRxiv

Top 0.1%

10.0%

Cerebellar pathways form extensive structural circuits linking the cerebellum with the brainstem, thalamus, and cerebrum, underlying motor, cognitive, and affective functions. Diffusion MRI tractography provides the only non-invasive method for mapping these pathways in vivo, but reconstruction of cerebellar connectivity remains challenging due to crossing fibers, peduncular bottlenecks, decussations, multi-synaptic circuits, and numerous small nuclei that define pathway origins and terminations. Here we introduce Anatomically Constrained and CURAted Tractography (ACCURAT), an open framework for reconstructing cerebellar pathways from diffusion MRI using anatomical priors and rule-based streamline queries. ACCURAT combines anatomical segmentation, densely seeded tractography, and vertex-level evaluation of anatomical constraints along streamline trajectories, enabling the isolation of pathway segments within specific nuclei while preventing their propagation across synaptic boundaries. To define these constraints, we provide a concise, pathway-by-pathway synthesis of cerebellar connectional anatomy based on experimental tract-tracing literature and organized for tractography applications. We identify pathway-specific origins, trajectories, terminations, decussation patterns, and tractography challenges, and use this information to inform tractography-ready cerebellar pathway definitions. Using ultra-high-resolution submillimeter diffusion MRI (0.76 mm gSlider acquisition) from healthy participants, we reconstruct multiple extrinsic and intrinsic cerebellar pathways, including specific components of the inferior, middle, and superior cerebellar peduncles; challenging decussating pathways such as the olivocerebellar and dentato-olivary projections; and intrinsic cerebellar pathways, including Purkinje corticonuclear projections and intracortical parallel fibers. 
ACCURAT generalizes across tractography algorithms, producing comparable reconstructions with both probabilistic parallel transport tractography and deterministic unscented Kalman filter tractography. Together, the ACCURAT framework and accompanying neuroanatomical reference provide an anatomically grounded, tractography-oriented resource for reconstructing cerebellar pathways in vivo and for supporting future development and evaluation of cerebellar tractography methods.

8

Cortical reconstruction and anatomical parcellation of high-resolution multi-modal postmortem ex vivo MRI of the human infant brain

Khandelwal, P.; Young, S.; Xi Ngo, N.; Yushkevich, P. A.; van der Kouwe, A.; Haynes, R. L.; Kinney, H. C.; Zollei, L.

2026-05-09 neuroscience 10.64898/2026.05.07.722301 medRxiv

Top 0.1%

10.0%

High-resolution postmortem (ex vivo) magnetic resonance imaging enables detailed examination of brain anatomy at spatial scales not achievable in vivo and provides a unique opportunity to link morphometric measurements with the underlying pathology. Despite these advantages, robust computational tools for automated anatomical segmentation and cortical surface reconstruction remain limited, particularly in postmortem infant brains. Incomplete myelination, thinner cortical ribbons, small-scale neuroanatomy, as well as an evolving tissue contrast combined with fixation-induced signal alterations and variability in postmortem preparation make standard neuroimaging pipelines unusable for postmortem infant MRI. In this work, we introduce a one-of-its-kind multi-modal high-resolution postmortem infant MRI dataset and a unified computational framework that combines deep learning-based volumetric segmentation with surface-based cortical reconstruction and anatomical parcellation in native subject space resolution. To address the pronounced domain shift inherent to postmortem MRI, we develop a postmortem-specific synthetic data generation engine (PostSynth) that explicitly models fixation-driven postmortem imaging characteristics. In particular, we incorporate postmortem-specific altered gray-white matter contrast, laminar cortical intensity heterogeneity, specimen-specific bias fields, and background signal characteristics associated with immersion media: phenomena not typically observed in in vivo data or captured by generic contrast-agnostic synthesis methods. We benchmark our framework against a set of widely used contrast-agnostic and foundational brain segmentation models, demonstrating improved anatomical consistency and segmentation performance in high-resolution postmortem infant data. The code is publicly available as part of the purple-mri package.

9

Generating Synthetic MR Perfusion Maps from DWI and FLAIR in Acute Ischemic Stroke: Development and External Validation of a Deep Learning Model

Matsulevits, A.; Koch, A.; Mahe-Verdure, C.; Bendszus, M.; Hilbert, A.; Boullet, M.; Marnat, G.; Mutke, M.; Aydin, O.; Olindo, S.; Sibon, I.; Frey, D.; Thiebaut de Schotten, M.; Tourdias, T.

2026-05-13 neuroscience 10.1101/2025.10.23.684079 medRxiv

Top 0.2%

8.3%

Background: Magnetic resonance imaging (MRI) is critical for acute stroke triage, but it is time-consuming and often requires contrast injection for perfusion imaging. This study aimed to synthesize T-max perfusion maps from routinely available, non-contrast DWI and FLAIR using deep generative models. We hypothesized that relevant perfusion information could be inferred from these modalities to streamline imaging and reduce reliance on dynamic susceptibility contrast perfusion. Methods: Acute MRI data from 355 patients with anterior circulation stroke, including dynamic susceptibility contrast perfusion, were retrospectively collected from two European centers (Heidelberg: 2010-2018; Bordeaux: 2021-2022). Six versions of a denoising diffusion probabilistic model (DDPM) and a GAN architecture were trained to generate synthetic T-max perfusion maps from DWI, FLAIR, and an infarct core mask as inputs. Performance was assessed by comparing synthetic and ground-truth T-max maps using image similarity metrics. Regions with T-max > 6 s were compared using Dice coefficients, and mismatch volume distributions were analyzed. An ablation study quantified the contribution of each input. Results: The best performance was achieved by a DDPM with a 2.5D architecture using DWI, FLAIR, infarct core mask, and a perfusion-weighted loss function. It produced synthetic T-max perfusion maps with high similarity to ground truth in under 110 seconds. The model showed strong spatial overlap for T-max > 6 s regions in internal validation (average Dice = 0.82, SD = 0.08) and in external validation (average Dice = 0.59, SD = 0.13). Synthetic maps closely matched ground-truth mismatch distributions, capturing key perfusion patterns. The infarct core mask played a critical role in model performance, alongside DWI and FLAIR inputs. Conclusions: We propose a non-invasive, scalable framework to generate synthetic T-max perfusion maps from non-contrast MRI. This approach could expand access to perfusion data in acute stroke, shorten imaging protocols, and accelerate treatment decisions by eliminating the need for contrast-enhanced acquisition.
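The T-max > 6 s overlap metric can be sketched as follows: threshold both maps at 6 s and compute the Dice coefficient of the resulting masks. The flat-list voxel layout, the function name, and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the abstract.

```python
def dice_above_threshold(synthetic, ground_truth, threshold=6.0):
    """Dice coefficient between the regions where each map exceeds `threshold`.

    Both inputs are flat sequences of T-max values (seconds) over the same voxels.
    """
    if len(synthetic) != len(ground_truth):
        raise ValueError("maps must cover the same voxels")
    synthetic_mask = [value > threshold for value in synthetic]
    truth_mask = [value > threshold for value in ground_truth]
    overlap = sum(s and t for s, t in zip(synthetic_mask, truth_mask))
    total = sum(synthetic_mask) + sum(truth_mask)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total
```

For example, maps `[7, 8, 2, 6.5, 1]` and `[7, 3, 2, 9, 10]` each have three voxels above 6 s and share two of them, giving a Dice of 4/6.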

10

Assessing Foundation Models for Computational Pathology in Endometrial Cancer

Volinsky-Fremond, S.; van den Berg, N.; Barkey Wolf, J.; Schoenpflug, L. A.; Andani, S.; Ortoft, G.; Jobsen, J. J.; Lutgens, L. C.; Powell, M. E.; Mileshkin, L. R.; Mackay, H.; Leary, A.; Razack, R. R.; de Bruyn, M.; de Boer, S. M.; Nout, R. A.; Smit, V. T.; Creutzberg, C. L.; Koelzer, V. H.; Bosse, T.; Horeweg, N.

2026-05-25 pathology 10.64898/2026.05.22.26353897 medRxiv

Top 0.2%

8.3%

Computational pathology leverages deep learning to extract clinically relevant information from digitized tumor slides, predicting histopathological subtypes, molecular alterations, and patient outcomes. Recent pipelines increasingly rely on foundation models trained on large pan-cancer datasets to generate generalizable features. In endometrial cancer (EC), their comparative performance for clinical diagnostic tasks remains unexplored. For the first time, this study evaluates the performance of seven state-of-the-art foundation models across morphological, molecular, and prognostic tasks using a large EC dataset of 3,293 patients from randomized trials and clinical cohorts. In addition, their performance was compared to one model (EsVIT) exclusively trained on EC. The foundation models H-OPTIMUS-0, CONCH, and VIRCHOW2, achieved the highest mean performance, but the best-performing foundation model varied by task. The top-performing foundation model outperformed the EC-specific feature extractor EsVIT across all tasks. This study highlights the superiority of foundation models over a domain-specific feature extractor in EC. Selecting the optimal foundation model for novel tasks remains challenging due to performance plateaus and limited information on the training datasets, requiring rigorous benchmarking and domain insight to reach maximum potential.

11

A Unified Form of Batch Harmonization Equation for Normative Modeling: A Location Scale Framework

Li, M.; Wang, Y.; Shen, Y.; Jia, G.

2026-05-20 bioengineering 10.64898/2026.05.17.725713 medRxiv

Top 0.2%

8.1%

Normative modeling quantifies individual deviation from population norms by estimating the conditional mean and variance of brain-derived measures as functions of clinically relevant parameters such as age. The rapid growth of multicenter consortia has created an urgent need for normative models that incorporate batch harmonization. Several harmonization methods based on linear mixed models--ComBat, GAMLSS, HBR, and Generalized Normative Modeling (GNM)--offer explicit formulations of the mean and variance, making them natural candidates for batch-harmonized normative modeling; yet the absence of a unified theoretical framework leaves it unclear whether and how these methods support the computation of batch-harmonized z-scores. We bridge this gap by writing existing harmonization methods as special cases of a single location-scale equation, $y = m(x, \Theta) + \sigma(x, \Theta)\,\varepsilon$, which we term the unified form of batch harmonization equation for normative modeling. The methods differ only in the functional forms of $m$ and $\sigma$, how batch parameters enter $\Theta$, and how $\Theta$ is estimated. This unified form yields both harmonized data $y^*$ and site-invariant z-scores from the same model, providing a common theoretical language for harmonized normative modeling. Building on this framework, we evaluate the underlying regression engines (parametric, spline, Gaussian process, kernel, deep learning), sensitivity to outliers, computational scalability, and federated decomposability for privacy-preserving multi-center computation. By clarifying what each method assumes, what it delivers, and where the boundaries of current methodology lie, the unified equation establishes a principled foundation for method selection and charts a path toward reliable, scalable, and privacy-aware normative modeling across multi-center neuroimaging.
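As an illustration of how one location-scale model can yield both a z-score and harmonized data, the sketch below uses a deliberately simple, hypothetical parameterization: a linear age effect with an additive batch shift in the mean and a multiplicative batch factor in the scale. This form, the parameter names, and the rule for mapping back to a batch-free reference are assumptions, not the paper's formulation of any specific method.

```python
def mean_fn(age, batch, theta):
    """m(x, Theta): assumed linear age trend plus an additive batch shift."""
    return theta["intercept"] + theta["slope"] * age + theta["batch_shift"][batch]


def scale_fn(age, batch, theta):
    """sigma(x, Theta): assumed constant scale times a multiplicative batch factor."""
    return theta["sigma"] * theta["batch_scale"][batch]


def z_score(y, age, batch, theta):
    """Invert y = m + sigma * eps to recover the standardized deviation eps."""
    return (y - mean_fn(age, batch, theta)) / scale_fn(age, batch, theta)


def harmonize(y, age, batch, theta):
    """Map y onto a batch-free reference: same z, batch terms removed (assumed rule)."""
    z = z_score(y, age, batch, theta)
    return theta["intercept"] + theta["slope"] * age + theta["sigma"] * z
```

Under these assumptions, two subjects of the same age with the same underlying deviation but scanned at different sites receive identical z-scores and identical harmonized values.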

12

From 3D Time-of-Flight Angiography to Accelerated 4D Arterial Spin Labeling Angiography: A Fast Few-Shot Transfer Learning Approach

Li, H.; Dragonu, I.; Jezzard, P.; Okell, T. W.; Chiew, M.

2026-05-20 neuroscience 10.64898/2026.05.18.725892 medRxiv

Top 0.2%

6.6%

Purpose: To develop a data-efficient deep learning framework for rapid reconstruction of highly accelerated 4D arterial spin labeling (ASL) magnetic resonance angiography (MRA) with robust generalization using extremely limited acquired data, addressing the challenges of prolonged acquisition and reconstruction time. Methods: A simulation-driven, few-shot transfer learning approach was adopted by leveraging publicly available 3D time-of-flight (TOF)-MRA data to generate realistic multi-coil complex-valued pseudo-ASL k-space datasets for large-scale pre-training. A 3D unrolled reconstruction network was trained on this simulated data using a histogram-weighted loss and subsequently extended to 4D using lightweight temporal fusion modules. Fine-tuning was performed using only two experimentally acquired 4D ASL-MRA datasets. The method was evaluated on retrospectively and prospectively undersampled Cartesian 4D ASL-MRA data acquired at 3T and compared with compressed sensing (CS) and locally low-rank (LLR) reconstructions. Results: The proposed method achieved superior reconstruction quality compared with CS and LLR, with improved vessel depiction, particularly in distal branches, and enhanced temporal fidelity. Quantitative evaluation demonstrated higher vessel-masked peak signal-to-noise ratio and structural similarity index measure, along with increased error entropy, indicating reduced noise and structured artifacts. The initial pre-trained model already outperformed conventional methods, while additional 4D fine-tuning further improved performance. Robust reconstruction was demonstrated in prospectively undersampled data and multi-slab acquisitions, enabling large-coverage, time-resolved angiography within clinically feasible scan times (4-6 min). Conclusions: Simulation-driven pre-training combined with few-shot fine-tuning enables accurate and rapid reconstruction of highly accelerated 4D ASL-MRA in data-limited settings. The proposed framework provides a practical pathway toward clinically feasible, non-contrast dynamic cerebrovascular imaging.

13

Reliable Uncertainty Under Class Imbalance and Distribution Shift: Class-Conditional Conformal Prediction of Multiple Sclerosis

Millar, A. S.; Roman, C.; Gouripeddi, R.; Facelli, J. C.

2026-05-15 health informatics 10.64898/2026.05.12.26353057 medRxiv

Top 0.2%

6.4%

Objectives: To evaluate whether class-conditional conformal prediction (CP) can provide reliable uncertainty quantification (UQ) under severe class imbalance and distribution shift, using multiple sclerosis (MS) diagnosis from magnetic resonance imaging (MRI) as a clinical exemplar. Methods: We evaluated marginal and class-conditional CP using 720 T2-weighted MRI scans (142 MS, 578 controls). A convolutional neural network trained on 3 T data was evaluated under distribution shift (1.5 T acquisitions and synthetic image degradations). Through 100 Monte Carlo experiments, we assessed coverage guarantees, class-specific performance, and relationships between calibration set size, coverage variance, and uncertainty. Results: Marginal CP severely under-covered the minority MS class (16.9% mean coverage at 1.5 T vs. 95.2% for controls) despite valid population-level guarantees. Class-conditional CP dramatically improved MS coverage to 77.5% at 1.5 T and 85.8% at 3 T, significantly reducing the frequency of severe undercoverage (<80%) while maintaining >89% control coverage. Minority class coverage variance increased due to limited calibration samples, matching theoretical Beta-binomial predictions. CP maintained validity under distribution shift; prediction set sizes scaled monotonically with shift severity, yielding clinically interpretable UQ. Conclusions: Class-conditional CP successfully mitigates systematic undercoverage of minority disease classes while maintaining validity under distribution shift. The approach offers a practical, model-agnostic solution for uncertainty quantification applicable across clinical AI systems, though increased coverage variance for less represented conditions reflects fundamental statistical constraints. By characterizing these variance trade-offs, this framework enables more reliable deployment of diagnostic AI in heterogeneous clinical environments across diverse medical domains where minority disease class detection is critical.
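A minimal sketch of class-conditional split conformal prediction follows. It calibrates one score threshold per class instead of a single pooled threshold. The nonconformity score (one minus the predicted probability of the true class), the dictionary-based probability format, and the function names are assumptions; the abstract does not specify the authors' exact procedure.

```python
from math import ceil, inf


def class_thresholds(cal_probs, cal_labels, alpha=0.1):
    """Per-class conformal thresholds from a held-out calibration set.

    cal_probs: list of dicts mapping class label -> predicted probability.
    cal_labels: list of true labels, aligned with cal_probs.
    """
    scores_by_class = {}
    for probs, label in zip(cal_probs, cal_labels):
        scores_by_class.setdefault(label, []).append(1.0 - probs[label])
    thresholds = {}
    for label, scores in scores_by_class.items():
        scores.sort()
        # Finite-sample corrected rank; too few samples means no finite threshold.
        rank = ceil((len(scores) + 1) * (1.0 - alpha))
        thresholds[label] = scores[rank - 1] if rank <= len(scores) else inf
    return thresholds


def prediction_set(probs, thresholds):
    """All labels whose nonconformity score falls within that class's threshold."""
    return {label for label, q in thresholds.items() if 1.0 - probs[label] <= q}
```

With very few calibration examples for a class, the threshold becomes infinite and that class is always included. This is one concrete way in which a small minority-class calibration set makes the procedure more conservative and its coverage more variable.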

14

A Comparative Evaluation of Structural MRI Foundation Models for Brain Age Regression and Sex Classification

Encin, A.; Gilmore, A.; Rokem, A.; Dickie, E.; Glatard, T.

2026-05-19 neuroscience 10.64898/2026.05.15.725427 medRxiv

Top 0.2%

6.4%

Foundation models pre-trained on large neuroimaging datasets offer a promising approach to overcome the limited sample sizes typical of mental health imaging studies, yet their generalization across diverse clinical populations remains unclear. We present the first systematic benchmark of four publicly available structural MRI foundation models -- AnatCL, BrainIAC, 3D-Neuro-SimCLR, and SwinBrain -- on tasks relevant to mental health research. Using T1-weighted MRI from the Parkinson's Progression Markers Initiative (PPMI), Healthy Brain Network (HBN), and Nathan Kline Institute (NKI), we evaluate these models on sex classification, brain age prediction, and Parkinson's disease (PD) classification, benchmarking against models trained from FreeSurfer-derived cortical thickness and cortical surface area features, as well as an untrained CNN baseline. Although some individual foundation models outperformed FreeSurfer on particular tasks and datasets, 3D-Neuro-SimCLR demonstrated the most consistent performance overall, with the notable exception of HBN sex classification, and all models failed to classify early-stage Parkinson's disease above chance. Notably, untrained CNNs achieved performance comparable to or exceeding FreeSurfer in multiple instances, establishing them as computationally efficient reference models. The cross-model feature correlation analysis reveals that foundation model representations correlate differently with traditional cortical measurements. These findings position structural MRI foundation models, particularly 3D-Neuro-SimCLR and AnatCL, as promising avenues to boost the performance of neuroimaging predictive models in mental health.

15

Evaluating OCT Device-Reported Image Quality Score: Towards a Task-Specific Quality Gate for Deep Learning-based Outer-Retina and Choroid Boundary Segmentation

Gadari, A.; Vichare, A. A.; Corona, F.; Vupparaboina, S. C.; Lall, S. R.; Gregori, G.; Hasan, N.; Sahel, J.-A.; Chhablani, J.; Bollepalli, S. C.; Vupparaboina, K. K.

2026-05-20 ophthalmology 10.64898/2026.05.17.26353399 medRxiv

Top 0.2%

6.4%

Manufacturer-defined signal-strength indices are frequently employed as quality benchmarks for automated optical coherence tomography analysis, yet their empirical relationship with deep learning segmentation accuracy remains unclear. Because these metrics were originally developed for conventional image-processing pipelines, their ability to predict modern model-based segmentation accuracy has not been empirically validated. To address this gap, we evaluated the Heidelberg Spectralis Q-score against U-Net segmentation performance across 5,047 B-scans from 103 eyes for three anatomical boundaries of the posterior segment of the eye: the Ellipsoid Zone (EZ), Bruch's Membrane (BM), and Choroid Outer Boundary (COB). Alongside standard boundary agreement metrics (MAE, MSE, Dice Similarity Coefficient), we adapted the Earth Mover's Distance (EMD) from optimal transport theory as a boundary evaluation metric. Unlike column-wise averages, EMD quantifies boundary agreement as a 2-D geometric displacement, directly measuring residual spatial displacement between the model-segmented boundary and the ground-truth boundary. Our results demonstrate that the Q-score - originally designed to gate image-processing-based automated analysis - is a poor predictor of deep learning boundary segmentation accuracy, with explained variance ($R^2$) failing to exceed 1.4% across all three boundaries. We further observed a monotonically increasing error hierarchy with anatomical depth (EZ < BM < COB), consistent across metrics, which is unexplained by the signal strength. At the COB, correlations were paradoxically positive, explained by a B-scan-level mediation chain: higher Q-scores correspond to greater choroidal thickness ($r = 0.113$, $\rho = 0.158$), which in turn predicts higher COB segmentation error ($r = 0.165$, $\rho = 0.191$) - a localization difficulty that global signal strength cannot capture. Collectively, these findings challenge the implicit assumption that signal-strength-based quality thresholds are a reliable proxy for deep learning model performance, and motivate a shift toward task-specific acquisition quality criteria calibrated to model performance rather than signal interpretability.

16

Deep learning optimisation for cardiology: Neural Architecture Search-driven arrhythmia classification with electrocardiograms

Vanegas Mueller, E.; Joe-Oshodi, A.; Banerjee, A.; Villarroel, M.

2026-05-30 cardiovascular medicine 10.64898/2026.05.28.26354348 medRxiv

Top 0.2%

6.4%

Cardiovascular disease is the leading cause of death worldwide. Sudden cardiac death (SCD) accounts for roughly 50% of all cardiac deaths. The electrocardiogram (ECG) is widely used for early diagnosis of cardiac disease. However, the complexity of accurate interpretation limits the ECG's efficacy. Modern deep learning methods have been applied to assist clinicians in diagnosis. We applied Neural Architecture Search (NAS), an automated machine learning technique, to identify optimal deep learning architectures for classifying cardiac arrhythmias from ECGs. We applied the Differentiable Architecture Search strategy to an AutoFormer search space to identify optimal self-attention architectures for arrhythmia classification. We trained, validated, and tested the resulting model on the PhysioNet Challenge 2021 dataset (n = 88,253), comprising ECGs across three continents. We performed hyperparameter optimisation on the NAS output, exploring input patch size, class weighting, and loss function. We evaluated performance using the PhysioNet Challenge metric and the area under the receiver operating characteristic curve (AUROC). The NAS converged towards minimal architectural configurations (embedding dimension: 384, depth: 4, self-attention heads: 4, MLP ratio: 1) with a validation challenge metric of 0.66 (PhysioNet Challenge 2021 winner: 0.63). The NAS-created network achieved an AUROC of 0.97 and a challenge metric of 0.71 during testing. Normal Sinus Rhythm and Sinus Tachycardia achieved AUROCs of 0.99. Low-QRS Voltage and T-wave abnormality were the worst-performing arrhythmias, with AUROCs of 0.89 and 0.90, respectively. We interpret these results as evidence that architectural simplicity drives performance in arrhythmia classification. Because SCD is unexpected, prevention strategies in free-living environments require lightweight computational resources suitable for wearable devices. 
Class imbalance fundamentally limits classification performance for rare arrhythmias such as Low-QRS Voltage and T-wave inversion, irrespective of hyperparameter choices. However, the self-attention mechanism can autonomously abstract clinical representations, simplifying clinical deployment by eliminating the need for an explicit feature-extraction pipeline.
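
As a rough illustration of how small the reported configuration is, the sketch below counts trainable parameters in the encoder blocks only. It assumes a standard ViT-style block (attention width equal to the embedding dimension, biases on every linear layer, two LayerNorms per block) and ignores the patch embedding, positional embedding, and classification head. The searched architecture may size its attention layers differently, so the figure is indicative only; `encoder_params` is a hypothetical helper.

```python
def encoder_params(embed_dim, depth, mlp_ratio):
    """Parameter count for `depth` standard transformer encoder blocks.

    The number of attention heads does not appear here: under the
    standard layout assumed, heads only partition the embedding
    dimension and add no parameters.
    """
    d = embed_dim
    hidden = int(d * mlp_ratio)
    qkv = d * 3 * d + 3 * d                      # fused query/key/value projection
    proj = d * d + d                             # attention output projection
    mlp = d * hidden + hidden + hidden * d + d   # two linear layers
    norms = 2 * (2 * d)                          # two LayerNorms (scale + shift)
    return depth * (qkv + proj + mlp + norms)


# Configuration reported in the abstract: dim 384, depth 4, MLP ratio 1.
print(encoder_params(384, 4, 1))  # 3554304, roughly 3.55 million
```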

17

Deep Computational Anatomy via Latent-Aligned Multiview Normalizing Flows

Tustison, N. J.; Avants, B. B.; Cook, P. A.; Gee, J. C.; Stone, J. R.

2026-05-10 bioinformatics 10.64898/2026.05.05.723039 medRxiv

Top 0.2%

6.3%

In modeling complex probability distributions, normalizing flows provide exact-likelihood, bijective mappings between empirical data and tractable latent spaces. Building on this foundation, latent-aligned multiview normalizing (LAMNr) flows leverage these salient properties to learn shared latent subspaces across heterogeneous, multimodal datasets while simultaneously topologically unfolding the sampled data manifold into a continuous vector space. Formal latent-alignment constraints are used to model shared structural features separately from view-specific variations, coordinating latent projections into a shared geometric subspace. By applying this transformation in the context of biological imaging, the framework establishes a potential basis for a deep learning interpretation of foundational computational anatomy concepts, such as the population template, latent distances, and geodesic pairwise image interpolation. Additionally, the proposed framework enables closed-form conditional modeling for exact cross-view imputation and other latent space manipulations. Evaluations and illustrations on both imaging-derived phenotypes (IDPs) and multimodal MRI demonstrate the proposed framework and potential applications. To further motivate our work, we provide a robust and comprehensive open-source PyTorch implementation for 2D and 3D data, natively integrated with the ANTsX ecosystem (i.e., ANTsTorch) for efficient training and subsequent data transformation, manipulation, and analysis.
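
For readers unfamiliar with the term, the "exact-likelihood" property refers to the standard change-of-variables identity for a bijective map. The second equation shows one common way a closed-form conditional can arise, namely when the aligned latents of two views are modeled as jointly Gaussian. That Gaussian assumption, and the symbols $f_\theta$, $z_a$, $z_b$, are ours for illustration; the abstract does not specify the latent model.

```latex
% Change of variables for a bijective flow f_\theta : x \mapsto z
\log p_X(x) = \log p_Z\bigl(f_\theta(x)\bigr)
            + \log\left|\det \frac{\partial f_\theta(x)}{\partial x}\right|

% Conditional of a jointly Gaussian latent pair (z_a, z_b) (illustrative assumption)
z_a \mid z_b \sim \mathcal{N}\!\Bigl(
    \mu_a + \Sigma_{ab}\Sigma_{bb}^{-1}(z_b - \mu_b),\;
    \Sigma_{aa} - \Sigma_{ab}\Sigma_{bb}^{-1}\Sigma_{ba}
\Bigr)
```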

18

SUITPy: A Python-based toolbox for the analysis of cerebellar functional and anatomical imaging data across the human lifespan

Wang, Y.; Li, Y.; Arafat, B.; Ashkanichenarlogh, V.; Nettekoven, C. R.; Pinho, A. L.; Hernandez-Castillo, C.; Marquand, A. F.; Diedrichsen, J.

2026-05-18 neuroscience 10.64898/2026.05.14.724397 medRxiv

Top 0.2%

6.2%

The human cerebellum plays a central role in motor, emotional, and cognitive functions, and is implicated in many brain disorders. To improve the analysis of functional and anatomical imaging from the cerebellum, we introduce SUITPy, an improved and fully revised Python implementation of the widely used SUIT toolbox. For this new version, we developed a U-Net based model to automatically isolate the cerebellum from adjacent cortical tissue, which achieves higher fidelity than existing algorithms. The isolation works robustly without manual corrections for imaging data across the lifespan. We show that isolation and subsequent normalization to a cerebellum-only template lead to a more precise alignment of cerebellar structures across participants compared to normalization using a whole-brain template. We also show the utility of the cerebellar mask to prevent contamination of cerebellar functional data from surrounding cortical structures. The toolbox also provides functionality for visualizing cerebellar data on a flatmap, along with a range of anatomical and functional cerebellar atlases, thereby offering an essential tool that enables accurate cerebellar analysis across the lifespan.

19

Gene-Modulated Network Diffusion for Improved Modeling of Amyloid-β Spread in Alzheimer's Disease

Xu, F. H.; Duong-Tran, D.; Huang, H.; Saykin, A. J.; Thompson, P. M.; Davatzikos, C.; Zhao, Y.; Shen, L.

2026-05-07 bioinformatics 10.64898/2026.05.04.722725 medRxiv

Top 0.2%

5.1%

Understanding the pathogenesis of amyloid-β pathology in Alzheimer's Disease (AD) proves to be a challenge. In this work, we expand upon the application of network diffusion models (NDM) to study pathophysiological spread of amyloid-β throughout white matter structural brain networks. We found that the NDM successfully recaptures subpopulation-level spatial patterns (Pearson's R=0.45-0.48, P_FDR < 0.01) of amyloid-β deposition in the Alzheimer's Disease Neuroimaging Cohort at a regional level, but with drawbacks in mechanism interpretability. We then moved to an extended NDM framework (eNDM), including a protein synthesis term to better reflect the role of amyloid-β metabolism, as well as including regional vulnerability using spatial transcriptomics from the Allen Human Brain Atlas to modulate the region-level rate parameters of the synthesis term. The novel gene eNDMs exhibited significant performance increases in Pearson's correlation (Steiger's Z, P_FDR < 0.10) over baseline NDM performance in mild cognitive impairment and AD groups using APOE, SORL1, and FGL2 for gene modulation. The results were robust and replicable when testing on an external cohort of the Alzheimer's Disease Sequencing Project. The study thus demonstrates the importance of regional genetic vulnerability, in conjunction with network diffusion mechanisms, in improving the modelling and prediction of amyloid-β pathophysiological spread.
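
The general shape of such a model can be sketched as follows. This is a toy illustration under our own assumptions, not the authors' formulation: the baseline is taken to be the usual graph-Laplacian diffusion dx/dt = -beta * L x, and the "synthesis term" is represented as a constant per-region source whose rate is scaled by a hypothetical gene-expression weight. The names `laplacian`, `simulate`, `gene_weight`, and `alpha` are illustrative.

```python
def laplacian(adj):
    """Graph Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0.0) - adj[i][j]
             for j in range(n)] for i in range(n)]


def simulate(adj, x0, beta, t_end, source=None, dt=0.01):
    """Forward-Euler integration of dx/dt = -beta * L x + source."""
    lap = laplacian(adj)
    n = len(x0)
    src = source if source is not None else [0.0] * n
    x = list(x0)
    for _ in range(round(t_end / dt)):
        flow = [sum(lap[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + dt * (-beta * flow[i] + src[i]) for i in range(n)]
    return x


# Toy 3-region chain network, pathology seeded in region 0.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
x0 = [1.0, 0.0, 0.0]

# Baseline diffusion: total burden is conserved and spreads out evenly.
baseline = simulate(adj, x0, beta=1.0, t_end=20.0)

# Hypothetical gene-modulated synthesis: rate alpha scaled per region.
alpha = 0.1
gene_weight = [0.0, 0.0, 1.0]   # made-up normalized expression values
extended = simulate(adj, x0, beta=1.0, t_end=20.0,
                    source=[alpha * g for g in gene_weight])

print(baseline)
print(extended)
```

In the baseline run the regional values only redistribute, whereas the source term lets total burden grow in a region-specific way, which is the qualitative behaviour a synthesis term is meant to capture.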

20

SPECTRA: Spatial Inference for Tractometry Toward Precision Mapping of White Matter Microstructure

Feng, Y.; Villalon-Reina, J. E.; Ba Gari, I.; Alibrando, J. D.; Thomopoulos, S.; Liou, K.; Somu, S.; Yoo, H.; Shuai, Y.; Chehrzadeh, S.; Nir, T. M.; Jahanshad, N.; Chandio, B. Q.; Thompson, P. M.

2026-05-13 neuroscience 10.64898/2026.05.08.723622 medRxiv

Top 0.2%

5.0%

Diffusion MRI tractometry characterizes white matter microstructure along fiber bundles, but standard along-tract profiling collapses measurements across the bundle cross-section, obscuring radial heterogeneity and producing spatially inconsistent units of inference. We present SPECTRA (Spatial Inference for Tractometry), a framework designed to address these limitations through a unified design of parameterization and statistical inference. First, we propose a 2D bundle parameterization that extends along-tract profiling to include a radial dimension defined on the atlas bundle. Second, we develop a two-stage hierarchical false discovery rate (hFDR) procedure for multi-bundle inference, which aggregates evidence at a coarser spatial scale before proceeding to finer-grained inference, with spatial scales derived from a Matérn kernel. Across extensive simulation conditions, we found that hFDR improves statistical power and reduces the sample size required to detect effects compared to global FDR correction, while maintaining appropriate error control. We further characterized how sensitivity-specificity tradeoffs depend on sample size and on the magnitude, spatial extent, and configuration of effects, thereby providing practical guidance for tractometry study design. In an empirical analysis of mild cognitive impairment and dementia in more than 4,000 subjects across 63 bundles, SPECTRA revealed spatially localized patterns that were absent in 1D profiles. Together, these results demonstrate that spatially resolved parameterization and adaptive error control jointly enable precise mapping of white matter microstructure in large-scale tractometry studies. SPECTRA is openly available as a Python package.
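
The coarse-then-fine idea behind a two-stage FDR procedure can be sketched generically as below. This is not SPECTRA's hFDR: the abstract does not give the aggregation rule or the level adjustment, so the sketch assumes Simes aggregation within each coarse block, Benjamini-Hochberg across blocks, and a within-block level scaled by the fraction of blocks selected. All function names are hypothetical and do not come from the SPECTRA package.

```python
def bh_reject(pvals, q):
    """Benjamini-Hochberg step-up: booleans marking rejected hypotheses."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            k = rank                      # largest rank passing its threshold
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


def simes(pvals):
    """Simes combined p-value for one block of fine-scale tests."""
    m = len(pvals)
    return min(p * m / rank
               for rank, p in enumerate(sorted(pvals), start=1))


def two_stage_fdr(blocks, q=0.05):
    """Stage 1: screen coarse blocks. Stage 2: test inside selected blocks."""
    block_p = [simes(b) for b in blocks]
    selected = bh_reject(block_p, q)
    # Assumed adjustment: shrink the level by the share of blocks selected.
    q_within = q * sum(selected) / len(blocks)
    return [bh_reject(b, q_within) if sel else [False] * len(b)
            for b, sel in zip(blocks, selected)]


# Made-up p-values: three coarse blocks with three fine-scale tests each.
blocks = [[0.001, 0.002, 0.8],
          [0.4, 0.6, 0.9],
          [0.3, 0.5, 0.7]]
print(two_stage_fdr(blocks))
```

Screening at the coarse level first means fine-scale tests are only run where there is block-level evidence, which is one reason such schemes can gain power over a single global correction.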